Effect of Log-Based Query Term Expansion on Retrieval Effectiveness in Patent Searching
Authors
Abstract
In this paper we study the impact of query term expansion (QTE) using synonyms on patent document retrieval. We use PatNet, a lexical database generated automatically from USPTO query logs, which provides synonyms and equivalents for a query term. Our experiments on the CLEF-IP 2010 benchmark dataset show that automatic query expansion using PatNet tends to decrease or only slightly improve retrieval effectiveness, with no significant improvement. An analysis of the retrieval results shows that PatNet does not generally have a negative effect on retrieval effectiveness. Recall improves drastically for query topics whose baseline queries achieve, on average, only low recall values. However, we have not detected any commonality that would allow us to characterize these queries. We therefore recommend using PatNet for semi-automatic QTE in Boolean retrieval, where expanding query terms with synonyms and equivalents to broaden the query scope is common practice.
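To make the Boolean use case concrete, the following is a minimal sketch of synonym-based query term expansion, assuming a plain dictionary as the synonym source. The `SYNONYMS` table, its entries, and the function names `expand_term` and `expand_query` are hypothetical and serve only as illustration; they are not taken from PatNet or from the paper's implementation.

```python
from typing import Dict, List

# Hypothetical synonym table standing in for a PatNet-like resource.
# The entries are illustrative only and are not taken from PatNet.
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
    "fastener": ["screw", "bolt"],
}


def expand_term(term: str, synonyms: Dict[str, List[str]]) -> str:
    """Return the term OR-ed with its synonyms, parenthesized if expanded."""
    alternatives = [term] + [s for s in synonyms.get(term, []) if s != term]
    if len(alternatives) == 1:
        # No synonyms known: leave the term unchanged.
        return term
    return "(" + " OR ".join(alternatives) + ")"


def expand_query(terms: List[str], synonyms: Dict[str, List[str]]) -> str:
    """Join the per-term OR groups with AND to form a Boolean query string."""
    return " AND ".join(expand_term(t, synonyms) for t in terms)


if __name__ == "__main__":
    print(expand_query(["car", "fastener", "torque"], SYNONYMS))
```

In a semi-automatic setting, the searcher would review each proposed OR group and drop unsuitable synonyms before running the query, rather than accepting every expansion.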
Related Articles
Achieving Effective Multi-term Queries for Fast DHT Information Retrieval
Distributed Hash Tables (DHTs) are well-suited for exact match lookups using unique identifiers, but do not directly support multi-term queries. Related research on query expansion has shown that adding new terms to a query via ad hoc feedback improves the retrieval effectiveness of such queries. In this paper, we propose an effective multi-term query processing algorithm for information retrieval...
QEA: A New Systematic and Comprehensive Classification of Query Expansion Approaches
A major problem in information retrieval is the difficulty of defining the user's information needs; on the other hand, when the user submits a query, there is a vast amount of information to retrieve. Different methods have therefore been suggested for query expansion, which is concerned with reconfiguring the query by increasing efficiency and improving the criterion accuracy in the information...
Enhancing passage retrieval in log files by query expansion based on explicit and pseudo relevance feedback
Passage retrieval is usually defined as the task of searching for passages which may contain the answer for a given query. While these approaches are very efficient when dealing with texts, applied to log files (i.e. semi-structured data containing both numerical and symbolic information) they usually provide irrelevant or useless results. Nevertheless one appealing way for improving the results ...
Vector-Based Semantic Expansion Approach: an Application to Patent Retrieval Master’s Thesis
Patent collection is increasing incrementally. Most of the new technological information is from patent documents, and retrieving specific patents in such a large pool of documents has become a challenging issue for both patent examiners and normal/inexperienced users. Nowadays, most of the retrieval systems (patent retrieval systems) search for the exact match of the query string inside the co...
Use of Controlled Vocabularies to Improve Biomedical Information Retrieval Tasks
The high heterogeneity of biomedical vocabulary is a major obstacle for information retrieval in large biomedical collections. Therefore, using biomedical controlled vocabularies is crucial for managing these contents. We investigate the impact of query expansion based on controlled vocabularies to improve the effectiveness of two search engines. Our strategy relies on the enrichment of users' ...